Abstract: It is well known that the Alamouti scheme is
the only space-time code from orthogonal design achieving the
capacity of a multiple-input multiple-output (MIMO) wireless
communication system with $n_T = 2$ transmit antennas and
$n_R = 1$ receive antenna. In this work, we propose the $n$-times
stacked Alamouti scheme for $n_T = 2n$ transmit antennas and
show that this scheme achieves the capacity in the case of $n_R = 1$
receive antenna. This result may be regarded as an extension of
the Alamouti case. For the more general case of more than one
receive antenna, we show that if the number of transmit antennas
is higher than the number of receive antennas, we achieve a
high portion of the capacity with this scheme. Further, we show
that the MIMO capacity is at most twice the rate achieved with
the proposed scheme for all SNR. We derive lower and upper
bounds for the rate achieved with this scheme and compare them
with upper and lower bounds for the capacity. In addition to the
capacity analysis based on the assumption of a coherent channel,
we analyze the error rate performance of the stacked OSTBC
with the optimal ML detector and with the suboptimal
lattice-reduction (LR) aided zero-forcing detector. We compare the error
rate performance of the stacked OSTBC with spatial multiplexing
(SM) and full-diversity achieving schemes. Finally, we illustrate
the theoretical results by numerical simulations.



I. Introduction 

Recent information theoretic results have demonstrated that 
the ability of a system to support a high link quality and 
higher data rates in the presence of Rayleigh fading improves 
significantly with the use of multiple transmit and receive 
antennas [1], [2]. Since then there has been considerable 
work on a variety of schemes [3] which exploit multiple 
antennas at both the transmitter and receiver in order to either 
obtain transmit and receive diversity and therefore increase 
the reliability of the system, e.g., orthogonal space-time block 
codes (OSTBC) and space-time trellis codes [4]-[6] or achieve 
the theoretical bounds [7] derived in [1], [2]. Interested readers 
are referred to [3], where a detailed analysis of different 
schemes is given. 

The performance of OSTBC with respect to mutual information
has been analyzed (among others) in [8]-[11] and it
was shown that the capacity is achieved only in the case of
$n_T = 2$ transmit antennas, the well known Alamouti scheme [5], and
$n_R = 1$ receive antenna, due to the rate loss inherent in
OSTBC with a higher number of transmit antennas. Recently,
it was shown in [12] that due to this rate loss, OSTBC 
with odd number of antennas are always outperformed by 
OSTBC with even number of antennas, restricting even more 
the deployment of OSTBC. On the one hand, we have the 
OSTBC with low complexity and low rates. On the other 
hand, we have the space-time trellis codes, which achieve 
higher spectral efficiency in addition to high performance with 
respect to frame error rates. However, the decoding complexity 
of space-time trellis codes is increasing exponentially with 
the number of transmit antennas and the transmission rate. In 
order to achieve higher spectral efficiency combined with low 
complexity maximum likelihood detectors, [13]-[17] designed 
quasi-orthogonal space-time block codes (QSTBC) with trans- 
mission rate one for more than two transmit antennas. 

Other approaches aimed at reducing the decoding complex- 
ity of space-time trellis codes. For instance, a layered space- 
time architecture was proposed in [18], where the transmit 
antennas were partitioned into two-antenna groups and on each 
group space-time trellis codes were used as component codes. 
In order to further decrease the complexity of this layered 
space-time architecture, [19]-[21] used the Alamouti scheme 
as component code for each group in combination with a sub- 
optimal successive group interference suppression detection 
strategy. The outage probability of this scheme was analyzed
in [22] for $n_T > n_R$ and an upper bound was derived. An
asymptotic analysis of the rate achievable with this scheme is
performed in [23]. For $n = 2$, this transmission scheme is also
referred to as double-space-time transmit diversity (DSTTD) 
and was proposed as one possible candidate for high speed 
downlink packet access (HSDPA) in 3GPP and beyond [24]. 

It is obvious that reducing the computational complexity 
of the detector without sacrificing much performance is an 
important issue. There is a large number of suboptimal detectors
with low complexity in the literature, e.g., linear detectors
like zero-forcing (ZF) or minimum mean square error (MMSE)
and nonlinear detectors like VBLAST [25]. Unfortunately,
these detectors significantly sacrifice performance in terms of 
bit-error-rate (BER). Recently, lattice reduction (LR) aided de- 
tection in combination with suboptimal detectors was proposed 
by Yao and Wornell in order to improve the performance of 
multi antenna systems [26]. The lattice reduction algorithm 
proposed in [26] is optimal, but works only for MIMO systems 
with two transmit and two receive antennas. The impact of 
receive antenna correlation on the performance of LR-aided 
detection was analyzed in [27]. In [28], the work of [26] was
extended to systems with more transmit and receive antennas
using the sub-optimal LLL [29] lattice reduction algorithm.
In [30], the LR-aided schemes in [28] were adapted to the
MMSE criterion. Note that the error rate curves of all these
LR detectors are parallel to those for maximum likelihood
(ML) detection with only some penalty in power efficiency.

In this work, we show that the stacked Alamouti scheme
is capable of achieving the capacity in combination with the
optimal maximum likelihood detector for the case of $n_T = 2n$
transmit antennas and $n_R = 1$ receive antenna. This was also
shown for the basic Alamouti scheme with $n_T = 2$ and $n_R = 1$ [8].
Our result may therefore be regarded as an extension of
the Alamouti scheme to $n_T > 2$. Furthermore, we show that in
the case of more than one receive antenna and if $n_T > n_R$, the
stacked Alamouti scheme is capable of achieving a significant
portion of the capacity and approaches the capacity if
$n_T \gg n_R$. For any $n_T$, $n_R$, we show that the MIMO capacity is at
most twice the rate achieved with the proposed scheme for all
SNR. However, achieving high portions of the capacity does
not guarantee good performance in terms of error probability.
Thus, we compare the error-rate performance of the proposed
scheme with spatial multiplexing (SM), a rate-oriented
space-time transmission scheme which achieves a high portion of
the capacity of MIMO systems, and with the aforementioned
diversity-oriented QSTBC by deploying LR-aided linear ZF
and ML detectors at the receiver, respectively.

The remainder of this paper is organized as follows. In
Section II we introduce the system model and establish the
notation. The structure of the stacked Alamouti scheme and
the equivalent channel model are shown in Section III. The
analysis of the mutual information is presented in Section IV.
LR-aided linear ZF detection is briefly described in Section V,
including the analysis of the probability density function of
the condition number of the equivalent channel generated by
the different transmission schemes (SM, QSTBC, and stacked
OSTBC). Section VI provides simulation results, followed by
some concluding remarks in Section VII.

II. System model 

We consider a system with $n_T$ transmit and $n_R$ receive
antennas. Our system model is defined by

$$\mathbf{Y} = \mathbf{G}_{n_T}\mathbf{H}^T + \mathbf{N}, \qquad (1)$$

where $\mathbf{G}_{n_T}$ is the $(T \times n_T)$ transmit matrix,
$\mathbf{Y} = [\mathbf{y}_1, \dots, \mathbf{y}_{n_R}]$ is the $(T \times n_R)$ receive matrix,
$\mathbf{H} = [\mathbf{h}_1, \dots, \mathbf{h}_{n_T}]$ is an $(n_R \times n_T)$ matrix characterizing the coherent
channel, and $\mathbf{N} = [\mathbf{n}_1, \dots, \mathbf{n}_{n_R}]$ is the complex $(T \times n_R)$
white Gaussian noise (AWGN) matrix, where an entry $\{n_{ti}\}$
of $\mathbf{N}$ $(1 \le i \le n_R)$ denotes the complex noise at the $i$th
receiver for a given time $t$ $(1 \le t \le T)$. The real and imaginary
parts of $n_{ti}$ are independent and $\mathcal{N}(0, n_T/(2\,\mathrm{SNR}))$ distributed.
An entry of the channel matrix is denoted by $\{h_{ij}\}$. This
represents the complex gain of the channel between the $j$th
transmit $(1 \le j \le n_T)$ and the $i$th receive $(1 \le i \le n_R)$
antenna, where the real and imaginary parts of the channel
gains are independent and normally distributed random variables
with $\mathcal{N}(0, 1/2)$ per dimension. The channel matrix is assumed
to be constant for a block of $T$ symbols and changes independently
from block to block. The average power of the symbols
transmitted from each antenna is normalized to be $1/n_T$, so
that the average power of the received signal at each receive
antenna is one and the signal-to-noise ratio (SNR) is $\rho$. It
is further assumed that the transmitter has no channel state
information (CSI) and the receiver has perfect CSI.
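
The system model in (1) can be illustrated with a minimal NumPy sketch. The function names, the random seed, and the placeholder QPSK transmit matrix are assumptions made only for this illustration; the noise variance follows the per-dimension value stated in the text, with `snr` taken on a linear scale.

```python
import numpy as np

def draw_channel(n_R, n_T, rng):
    """Channel H (n_R x n_T): real and imaginary parts i.i.d. N(0, 1/2)."""
    re = rng.standard_normal((n_R, n_T))
    im = rng.standard_normal((n_R, n_T))
    return (re + 1j * im) / np.sqrt(2.0)

def draw_noise(T, n_R, n_T, snr, rng):
    """Noise N (T x n_R): real and imaginary parts i.i.d. N(0, n_T / (2 * snr))."""
    sigma = np.sqrt(n_T / (2.0 * snr))
    re = rng.standard_normal((T, n_R))
    im = rng.standard_normal((T, n_R))
    return sigma * (re + 1j * im)

def received_block(G, H, N):
    """Receive matrix Y = G H^T + N of model (1); G is T x n_T."""
    return G @ H.T + N

rng = np.random.default_rng(0)
T, n_T, n_R, snr = 2, 4, 2, 10.0
# Placeholder transmit matrix with unit-modulus (QPSK) entries, illustration only.
G = np.exp(2j * np.pi * rng.integers(0, 4, size=(T, n_T)) / 4)
H = draw_channel(n_R, n_T, rng)
Y = received_block(G, H, draw_noise(T, n_R, n_T, snr, rng))
print(Y.shape)
```

The receive matrix has one row per time slot and one column per receive antenna, matching the $(T \times n_R)$ dimension given above.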

III. Code construction 

A space-time block code is defined by its transmit matrix
$\mathbf{G}_{n_T}$, whose entries are elements of the vector
$\mathbf{x} = [x_1, \dots, x_p]^T$ with $x_1, \dots, x_p \in \mathcal{C}$, where $\mathcal{C} \subset \mathbb{C}$ denotes
a complex modulation signal set with unit average power,
e.g. $M$-PSK. The rate $R$ of a space-time code is defined
as $R = p/T$. In this paper, we focus on the rate $n_T/2$
stacked Alamouti scheme. Starting with the well known (basic)
Alamouti scheme [5] for $n_T = 2$ transmit antennas



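$$\mathbf{G}_2(x_1, x_2) = \begin{bmatrix} x_1 & x_2 \\ x_2^* & -x_1^* \end{bmatrix},$$

the transmit matrix of the stacked Alamouti scheme with $n_T = 2n$
is constructed in the following way

$$\mathbf{G}_{n_T} = [\mathbf{G}_2(x_1, x_2), \mathbf{G}_2(x_3, x_4), \dots, \mathbf{G}_2(x_{n_T-1}, x_{n_T})].$$

Example 3.1: For the case of $n = 2$, i.e. $n_T = 4$ transmit
antennas we have

$$\mathbf{G}_4(\{x_i\}_{i=1}^{4}) = \begin{bmatrix} x_1 & x_2 & x_3 & x_4 \\ x_2^* & -x_1^* & x_4^* & -x_3^* \end{bmatrix},$$

which is also referred to as DSTTD [24].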

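After some manipulations (particularly complex-conjugating)
the system model in (1) can be rewritten as

$$\mathbf{y}' = \mathbf{H}'\mathbf{x} + \mathbf{n}', \qquad (2)$$

where $\mathbf{y}', \mathbf{n}' \in \mathbb{C}^{2n_R \times 1}$ and $\mathbf{H}' \in \mathbb{C}^{2n_R \times n_T}$. The equivalent
channel equals

$$\mathbf{H}' = \left[(\mathbf{H}'_1)^T, \dots, (\mathbf{H}'_{n_R})^T\right]^T,$$

where $\mathbf{H}'_i$ is given as

$$\mathbf{H}'_i = \begin{bmatrix} h_{i,1} & h_{i,2} & \cdots & h_{i,n_T-1} & h_{i,n_T} \\ -h_{i,2}^* & h_{i,1}^* & \cdots & -h_{i,n_T}^* & h_{i,n_T-1}^* \end{bmatrix}. \qquad (3)$$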
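
The construction of the stacked transmit matrix and of the equivalent channel can be checked numerically. The following is a minimal sketch under the sign convention reconstructed above (second row of $\mathbf{G}_2$ equal to $[x_2^*, -x_1^*]$); the function names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def stacked_alamouti(x):
    """2 x n_T transmit matrix [G2(x1,x2), G2(x3,x4), ...] for even-length x."""
    x = np.asarray(x, dtype=complex)
    a, b = x[0::2], x[1::2]
    G = np.empty((2, x.size), dtype=complex)
    G[0, 0::2], G[0, 1::2] = a, b                      # first time slot
    G[1, 0::2], G[1, 1::2] = np.conj(b), -np.conj(a)   # second time slot
    return G

def equivalent_channel(H):
    """2*n_R x n_T equivalent channel H' built from the n_R x n_T channel H."""
    H = np.asarray(H, dtype=complex)
    n_R, n_T = H.shape
    Hp = np.empty((2 * n_R, n_T), dtype=complex)
    Hp[0::2] = H                                # odd rows: the actual channel
    Hp[1::2, 0::2] = -np.conj(H[:, 1::2])       # even rows: conjugated, swapped
    Hp[1::2, 1::2] = np.conj(H[:, 0::2])
    return Hp

rng = np.random.default_rng(0)
n_R, n_T = 2, 4
H = (rng.standard_normal((n_R, n_T)) + 1j * rng.standard_normal((n_R, n_T))) / np.sqrt(2)
x = np.exp(2j * np.pi * rng.integers(0, 4, size=n_T) / 4)

Y = stacked_alamouti(x) @ H.T                   # noise-free model (1), shape 2 x n_R
y_prime = np.empty(2 * n_R, dtype=complex)
y_prime[0::2] = Y[0]                            # first time slot as received
y_prime[1::2] = np.conj(Y[1])                   # second time slot, conjugated
print(np.allclose(y_prime, equivalent_channel(H) @ x))
```

Conjugating the second received time slot is the "manipulation" that turns the matrix model (1) into the linear vector model (2).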



IV. Mutual Information 

The instantaneous capacity $I$ of a MIMO system with $n_T$
transmit and $n_R$ receive antennas is given as [1], [2]

$$I = \log_2 \det\left(\mathbf{I}_{n_R} + \frac{\rho}{n_T}\mathbf{H}\mathbf{H}^H\right). \qquad (4)$$



In the following two subsections, we derive lower and upper 
bounds for both the ergodic capacity and the average rate 
achievable with the proposed stacked scheme in order to yield 
lower and upper bounds on the ratio of the ergodic capacity to 
the average rate of the stacked OSTBC. In the third subsection, 
we characterize the absolute loss of the average rate of the 
stacked OSTBC to the ergodic capacity. 
A. Upper bounds on the ergodic capacity and the average rate
of stacked OSTBC

By applying the trace-determinant inequality $\det(\mathbf{A})^{1/n} \le \frac{1}{n}\mathrm{tr}(\mathbf{A})$,
we arrive at a simple upper bound on the instantaneous
capacity given as

$$I \le I_{ub} = L \log_2\left(1 + \frac{\rho}{n_T L}\sum_{j=1}^{n_T}\sum_{i=1}^{n_R}|h_{ij}|^2\right), \qquad (5)$$

where $L = \min(n_T, n_R)$. Averaging the upper
bound in (5) over all channel realizations results in [31]
($C = \mathbb{E}[I]$ denotes the ergodic capacity)

$$C \le C_{ub} = \mathbb{E}[I_{ub}] = \frac{L}{\ln(2)}\, e^{\frac{n_T L}{\rho}} \sum_{k=1}^{n_T n_R} \left(\frac{n_T L}{\rho}\right)^{n_T n_R - k} \Gamma\left(-(n_T n_R - k), \frac{n_T L}{\rho}\right). \qquad (6)$$

Note that for high SNR, the slope of the upper bound is equal
to $L$. In addition to this upper bound, we compare the rate
achieved with the stacked scheme with the following upper
bound

$$C \le C_{Jen} = \log_2\left(\sum_{i=0}^{L}\binom{L}{i}\frac{K!}{(K-i)!}\left(\frac{\rho}{n_T}\right)^{i}\right) \qquad (7)$$

derived in [32] by using Jensen's inequality, where $K = \max(n_T, n_R)$.

In the following, we analyze the performance of the stacked
scheme with respect to mutual information and derive upper
bounds for the average rate of the stacked scheme. We first
analyze the case of $n_R = 1$ receive antenna and then
generalize the analysis to the case of an arbitrary number of
receive antennas.

1) Case $n_R = 1$: In case of $n_R = 1$, the achievable rate of
the stacked Alamouti scheme is

$$I_{sA} = \frac{1}{2}\log_2\det\left(\mathbf{I}_{n_T} + \frac{\rho}{n_T}(\mathbf{H}')^H\mathbf{H}'\right).$$

For $n_R = 1$ the two rows of $\mathbf{H}'$ are orthogonal and have equal norm, so that
$\mathbf{H}'(\mathbf{H}')^H = \left(\sum_{j=1}^{n_T}|h_{1j}|^2\right)\mathbf{I}_2$.
Using the determinant equality $\det(\mathbf{I} + \mathbf{A}\mathbf{B}) = \det(\mathbf{I} + \mathbf{B}\mathbf{A})$,
after some manipulations we arrive at

$$I_{sA} = \log_2\left(1 + \frac{\rho}{n_T}\sum_{j=1}^{n_T}|h_{1j}|^2\right), \qquad (8)$$

which equals the capacity of a MIMO system with $n_T$ transmit
and $n_R = 1$ receive antennas [1], i.e. as long as $n_R = 1$,
the capacity is achieved for arbitrary $n = n_T/2$. Note
that in [3, p. 199] a Taylor series expansion is performed
for the capacity and the mutual information achievable with
certain schemes such as the stacked OSTBC. After comparing
the first two expansion coefficients (the linear term and the
second-order coefficients) it is shown that the stacked OSTBC
reaches second-order capacity for $n_R = 1$, i.e. the second-order
coefficient of the mutual information of the stacked
OSTBC is equal to the second-order coefficient of the capacity.
Although essential features of the mutual information can
already be seen from the first and second-order coefficients
(especially at low SNR), our result above may be regarded as
more general, since the exact capacity and mutual information
expressions are analyzed. Further note that the result above
may be regarded as an extension of the results in [8]. There it
was shown that the basic Alamouti scheme with $n_T = 2$ and
$n_R = 1$ achieves the capacity.
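
The equality between (8) and the capacity for $n_R = 1$ can be verified numerically. The sketch below is an illustration only; the helper names are assumptions, and the even rows of the equivalent channel are built as $\mathbf{H}^*\mathbf{J}$ with $\mathbf{J} = \mathbf{I}_{n_T/2} \otimes [0\ 1; -1\ 0]$, which matches the sign convention of (3).

```python
import numpy as np

def even_rows(H):
    """Even rows of the equivalent channel: conj(H) @ J."""
    J = np.kron(np.eye(H.shape[1] // 2), [[0.0, 1.0], [-1.0, 0.0]])
    return np.conj(H) @ J

def log2det(M):
    """log2 of the determinant of a positive definite matrix."""
    return np.linalg.slogdet(M)[1] / np.log(2.0)

def capacity(H, rho):
    """Instantaneous capacity (4)."""
    n_R, n_T = H.shape
    return log2det(np.eye(n_R) + rho / n_T * H @ H.conj().T)

def stacked_rate(H, rho):
    """I_sA = 1/2 log2 det(I + rho/n_T * H'^H H')."""
    n_T = H.shape[1]
    He = even_rows(H)
    gram = H.conj().T @ H + He.conj().T @ He     # equals H'^H H'
    return 0.5 * log2det(np.eye(n_T) + rho / n_T * gram)

rng = np.random.default_rng(0)
for n_T in (2, 4, 6):
    h = (rng.standard_normal((1, n_T)) + 1j * rng.standard_normal((1, n_T))) / np.sqrt(2)
    print(n_T, capacity(h, 10.0), stacked_rate(h, 10.0))
```

For a single receive antenna the two printed values agree up to floating-point error for every even $n_T$.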

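2) Case of $n_T = 4$ and $n_R = 2$ (DSTTD): In the case of
$n_T = 4$ transmit and $n_R = 2$ receive antennas, the equivalent
channel is given by

$$\mathbf{H}' = \begin{bmatrix} h_{11} & h_{12} & h_{13} & h_{14} \\ -h_{12}^* & h_{11}^* & -h_{14}^* & h_{13}^* \\ h_{21} & h_{22} & h_{23} & h_{24} \\ -h_{22}^* & h_{21}^* & -h_{24}^* & h_{23}^* \end{bmatrix}.$$

The mutual information in this case is given as

$$I_{sA} = \frac{1}{2}\log_2\det\left(\mathbf{I}_4 + \frac{\rho}{n_T}\begin{bmatrix} \lambda_1 & 0 & \alpha_1 & \alpha_2 \\ 0 & \lambda_1 & -\alpha_2^* & \alpha_1^* \\ \alpha_1^* & -\alpha_2 & \lambda_2 & 0 \\ \alpha_2^* & \alpha_1 & 0 & \lambda_2 \end{bmatrix}\right),$$

where $\lambda_i = \sum_{j=1}^{4}|h_{ij}|^2$,
$\alpha_1 = h_{11}h_{21}^* + h_{12}h_{22}^* + h_{13}h_{23}^* + h_{14}h_{24}^*$, and
$\alpha_2 = -h_{11}h_{22} + h_{12}h_{21} - h_{13}h_{24} + h_{14}h_{23}$. Using
Fischer's inequality

$$\det\begin{bmatrix} \mathbf{A} & \mathbf{B}^H \\ \mathbf{B} & \mathbf{D} \end{bmatrix} \le \det(\mathbf{A})\det(\mathbf{D})$$

yields

$$I_{sA} \le \log_2\left(\left(1 + \frac{\rho}{n_T}\lambda_1\right)\left(1 + \frac{\rho}{n_T}\lambda_2\right)\right).$$

By using the arithmetic-geometric inequality, we arrive at

$$I_{sA} \le 2\log_2\left(1 + \frac{\rho}{2 n_T}\|\mathbf{H}\|_F^2\right).$$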

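This upper bound equals twice the rate of a full code rate
OSTBC for $n_T = 4$ transmit and $n_R = 2$ receive antennas
with a power penalty of 3 dB. In this particular case a more
precise statement can be made due to the following strict form
of Fischer's inequality [33].

Lemma 4.1: Let $\mathbf{P} = \begin{bmatrix} \mathbf{A} & \mathbf{B}^H \\ \mathbf{B} & \mathbf{D} \end{bmatrix}$ ($\mathbf{A}$, $\mathbf{D}$ square,
nonempty) be positive definite. Then

$$\mathbf{B} \text{ has full rank} \;\Rightarrow\; \det\mathbf{P} < (\det\mathbf{A})(\det\mathbf{D}).$$

Proof: Let $\mathbf{R} \succ 0$ denote positive definiteness, and $\mathbf{R} \succ \mathbf{S}$
be defined by $(\mathbf{R} - \mathbf{S}) \succ 0$. Then [34, 7.7.6] $\mathbf{P} \succ 0 \Leftrightarrow (\mathbf{A} \succ 0,\ \mathbf{D} \succ \mathbf{B}\mathbf{A}^{-1}\mathbf{B}^H)$.
Thus for arbitrary $\mathbf{B}$ it holds that
$\mathbf{D} - (\mathbf{D} - \mathbf{B}\mathbf{A}^{-1}\mathbf{B}^H) = \mathbf{B}\mathbf{A}^{-1}\mathbf{B}^H \succeq 0$, and the inequality becomes strict if $\mathbf{B}$ has
full rank. Since $(0 \prec \mathbf{S} \prec \mathbf{R} \Rightarrow \det\mathbf{S} < \det\mathbf{R})$ we obtain
$\det\mathbf{P} = (\det\mathbf{A})(\det[\mathbf{D} - \mathbf{B}\mathbf{A}^{-1}\mathbf{B}^H]) < (\det\mathbf{A})(\det\mathbf{D})$, if
$\mathbf{B}$ has full rank. ■

From $\det\mathbf{B} = |\alpha_1|^2 + |\alpha_2|^2$ it follows that, apart from the set
of events $\{\alpha_1 = \alpha_2 = 0\}$ of measure zero, $\mathbf{B}$ has full rank,
thus the upper bound for $I_{sA}$ is strict with probability one.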
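
The chain of inequalities for the DSTTD case (exact rate, Fischer bound, arithmetic-geometric bound) can be checked on random channel draws. This is a sketch for illustration; the function names are assumptions, and the equivalent channel uses the row ordering of the matrix given above.

```python
import numpy as np

def equivalent_channel(H):
    """Equivalent channel H' with interleaved actual and conjugated rows."""
    n_R, n_T = H.shape
    Hp = np.empty((2 * n_R, n_T), dtype=complex)
    Hp[0::2] = H
    Hp[1::2, 0::2] = -np.conj(H[:, 1::2])
    Hp[1::2, 1::2] = np.conj(H[:, 0::2])
    return Hp

def dsttd_bounds(H, rho):
    """Return (I_sA, Fischer bound, arithmetic-geometric bound) for a 2 x 4 channel."""
    n_R, n_T = H.shape
    if (n_R, n_T) != (2, 4):
        raise ValueError("this sketch covers the DSTTD case n_T = 4, n_R = 2 only")
    Hp = equivalent_channel(H)
    gram = Hp @ Hp.conj().T                       # 4 x 4 matrix with blocks lambda_i * I_2
    rate = 0.5 * np.linalg.slogdet(np.eye(4) + rho / n_T * gram)[1] / np.log(2.0)
    lam = np.sum(np.abs(H) ** 2, axis=1)          # lambda_1, lambda_2
    fischer = np.log2((1 + rho / n_T * lam[0]) * (1 + rho / n_T * lam[1]))
    am_gm = 2 * np.log2(1 + rho / (2 * n_T) * lam.sum())
    return rate, fischer, am_gm

rng = np.random.default_rng(0)
H = (rng.standard_normal((2, 4)) + 1j * rng.standard_normal((2, 4))) / np.sqrt(2)
print(dsttd_bounds(H, 10.0))
```

The three returned values are non-decreasing from left to right, and the first inequality is strict for generic channel draws, in line with Lemma 4.1.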
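3) Case of arbitrary $n_R$: The mutual information achievable
with $n_R > 1$ receive antennas for the stacked
Alamouti scheme is

$$I_{sA} = \frac{1}{2}\log_2\det\left(\mathbf{I}_{n_T} + \frac{\rho}{n_T}(\mathbf{H}')^H\mathbf{H}'\right). \qquad (9)$$

Following the derivation above for arbitrary $n_R$ results in

$$I_{sA} \le \frac{L_1}{2}\log_2\left(1 + \frac{2\rho}{n_T L_1}\sum_{j=1}^{n_T}\sum_{i=1}^{n_R}|h_{ij}|^2\right), \qquad (10)$$

where $L_1 = \min(n_T, 2n_R)$. By averaging (10) over all
channel realizations, an upper bound on the average rate
$R_{sA} = \mathbb{E}[I_{sA}]$ of the stacked Alamouti scheme similar to (6)
may be obtained

$$R_{sA} \le \frac{L_1}{2\ln(2)}\, e^{\frac{n_T L_1}{2\rho}} \sum_{k=1}^{n_T n_R}\left(\frac{n_T L_1}{2\rho}\right)^{n_T n_R - k}\Gamma\left(-(n_T n_R - k), \frac{n_T L_1}{2\rho}\right), \qquad (11)$$

which can be approximated using $\log_2(1 + x) \approx \log_2(x)$ for
$x \gg 1$ by

$$\frac{L_1}{2}\log_2\left(\frac{2\rho}{n_T L_1}\right) + \frac{L_1}{2\ln(2)}\left(\sum_{p=1}^{n_T n_R - 1}\frac{1}{p} - \gamma\right).$$

Note that the approximation gets better for higher SNR and
may be inaccurate for low SNR. Further note that, for high
SNR, the slope of the upper bound (11) and its approximation
is equal to $L_1/2$.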
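
The instantaneous bound (10) and the high-SNR approximation can be compared against Monte Carlo averages. The following sketch is illustrative; the function names, sample size, and seed are assumptions, and the closed form (11) is deliberately replaced by a sample average of (10) to avoid incomplete gamma functions with negative order.

```python
import numpy as np

def stacked_rate(H, rho):
    """I_sA of (9), using H'^H H' = H^H H + He^H He with He = conj(H) @ J."""
    n_T = H.shape[1]
    J = np.kron(np.eye(n_T // 2), [[0.0, 1.0], [-1.0, 0.0]])
    He = np.conj(H) @ J
    gram = H.conj().T @ H + He.conj().T @ He
    return 0.5 * np.linalg.slogdet(np.eye(n_T) + rho / n_T * gram)[1] / np.log(2.0)

def instantaneous_bound(H, rho):
    """Right-hand side of (10)."""
    n_R, n_T = H.shape
    L1 = min(n_T, 2 * n_R)
    return 0.5 * L1 * np.log2(1 + 2 * rho / (n_T * L1) * np.sum(np.abs(H) ** 2))

def high_snr_approx(rho, n_T, n_R):
    """High-SNR approximation of the averaged bound."""
    L1 = min(n_T, 2 * n_R)
    harmonic = sum(1.0 / p for p in range(1, n_T * n_R))
    return (0.5 * L1 * np.log2(2 * rho / (n_T * L1))
            + L1 / (2 * np.log(2.0)) * (harmonic - np.euler_gamma))

def draw(n_R, n_T, rng):
    return (rng.standard_normal((n_R, n_T)) + 1j * rng.standard_normal((n_R, n_T))) / np.sqrt(2)

rng = np.random.default_rng(0)
n_T, n_R, rho = 4, 2, 1.0e4
samples = [draw(n_R, n_T, rng) for _ in range(20000)]
avg_bound = np.mean([instantaneous_bound(H, rho) for H in samples])
print(avg_bound, high_snr_approx(rho, n_T, n_R))
```

At the large SNR used here the sample average of (10) and the approximation are close, while at low SNR the approximation is not expected to be accurate.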

B. Lower bounds on the ergodic capacity and the average rate 
of stacked OSTBC 

Similarly to the last subsection, here we derive lower bounds
for the ergodic capacity and the average rate of the stacked
OSTBC. Due to the particular structure of the stacked OSTBC,
lower bounds are obtained separately for the following
cases: (i) $n_T < n_R$, (ii) $n_R < n_T < 2n_R$, (iii) $2n_R < n_T < 4n_R$,
and (iv) $4n_R < n_T$.

First of all, from [35] we obtain the following lower bound
on the ergodic capacity

$$C \ge \sum_{j=1}^{L}\log_2\left(1 + \frac{\rho}{n_T}\exp\left(\sum_{p=1}^{K-j}\frac{1}{p} - \gamma\right)\right), \qquad (12)$$

where $\gamma \approx 0.57721566$ is Euler's constant.

In order to derive an upper bound on the ratio of the ergodic
capacity to the average rate achieved with the stacked scheme,
we need a lower bound for the average rate of the stacked
scheme. To this end, we rewrite (9) as follows

$$I_{sA} = \frac{1}{2}\log_2\det\left(\mathbf{I}_{n_T} + \frac{\rho}{n_T}\mathbf{H}^H\mathbf{H} + \frac{\rho}{n_T}\mathbf{H}_e^H\mathbf{H}_e\right), \qquad (13)$$

where $\mathbf{H}$ is the actual MIMO channel, which is obtained by
taking the odd rows of the equivalent channel $\mathbf{H}'$, and $\mathbf{H}_e$ is
obtained by taking the even rows of $\mathbf{H}'$. The relation between
the actual channel $\mathbf{H}$ and $\mathbf{H}_e$ is described in the following
proposition.

Proposition 4.1: Let $\mathbf{H}_e$ be the even and $\mathbf{H}$ the odd rows
of $\mathbf{H}'$ given in (2), respectively. Then the following holds

1) $\mathbf{H}_e = \mathbf{H}^*\mathbf{J}$, where¹

$$\mathbf{J} = \mathbf{I}_{n_T/2} \otimes \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}.$$

2) $\mathbb{E}[\mathbf{H}\mathbf{H}_e^H] = \mathbb{E}[\mathbf{H}\mathbf{J}^H\mathbf{H}^T] = \mathbf{0}$.

Proof: The proof is straightforward and uninformative
and thus it is omitted. ■
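
Both statements of Proposition 4.1 can be checked numerically. The sketch below is an illustration under the sign convention used for the equivalent channel in (3); the helper names are assumptions, and the second statement is only verified approximately through a sample average.

```python
import numpy as np

def j_matrix(n_T):
    """J = I_{n_T/2} kron [[0, 1], [-1, 0]]."""
    return np.kron(np.eye(n_T // 2), [[0.0, 1.0], [-1.0, 0.0]])

def even_rows_direct(H):
    """Even rows of H' written out entry by entry."""
    He = np.empty_like(H)
    He[:, 0::2] = -np.conj(H[:, 1::2])
    He[:, 1::2] = np.conj(H[:, 0::2])
    return He

def draw(n_R, n_T, rng):
    return (rng.standard_normal((n_R, n_T)) + 1j * rng.standard_normal((n_R, n_T))) / np.sqrt(2)

rng = np.random.default_rng(0)
n_R, n_T = 2, 4
H = draw(n_R, n_T, rng)
He = np.conj(H) @ j_matrix(n_T)                   # statement 1): He = H^* J
print(np.allclose(He, even_rows_direct(H)))

# Statement 2): the sample mean of H He^H should be close to the zero matrix.
acc = np.zeros((n_R, n_R), dtype=complex)
trials = 5000
for _ in range(trials):
    Hs = draw(n_R, n_T, rng)
    acc += Hs @ (np.conj(Hs) @ j_matrix(n_T)).conj().T
print(np.abs(acc / trials).max())
```

The cross term vanishes on average because $\mathbb{E}[h_{ij}h_{kl}] = 0$ for circularly symmetric complex Gaussian entries.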
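Eq. (13) can be rewritten as

$$I_{sA} = \frac{1}{2}\log_2\left[\det\left(\mathbf{I}_{n_T} + \frac{\rho}{n_T}\mathbf{H}^H\mathbf{H}\right) \times \det\left(\mathbf{I}_{n_R} + \frac{\rho}{n_T}\mathbf{H}_e\left(\mathbf{I}_{n_T} + \frac{\rho}{n_T}\mathbf{H}^H\mathbf{H}\right)^{-1}\mathbf{H}_e^H\right)\right]$$

$$\phantom{I_{sA}} = \frac{1}{2}\log_2\det\left(\mathbf{I}_{n_T} + \frac{\rho}{n_T}\mathbf{H}^H\mathbf{H}\right) + \frac{1}{2}\log_2\det\left(\mathbf{I}_{n_R} + \frac{\rho}{n_T}\mathbf{H}_e\left(\mathbf{I}_{n_T} + \frac{\rho}{n_T}\mathbf{H}^H\mathbf{H}\right)^{-1}\mathbf{H}_e^H\right).$$

Since $\mathbf{H}_e\left(\mathbf{I}_{n_T} + \frac{\rho}{n_T}\mathbf{H}^H\mathbf{H}\right)^{-1}\mathbf{H}_e^H$ is a positive
semidefinite matrix, it follows immediately that the rate
achieved with the stacked Alamouti scheme is lower bounded by

$$I_{sA} \ge \frac{1}{2}\log_2\det\left(\mathbf{I}_{n_T} + \frac{\rho}{n_T}\mathbf{H}^H\mathbf{H}\right),$$

which is half the capacity of a MIMO system with $n_T$ transmit
and $n_R$ receive antennas.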
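
The determinant factorization of (13) and the resulting half-capacity lower bound can be verified on random channels. This is a minimal sketch with illustrative helper names; it checks the identity and the inequality numerically rather than proving them.

```python
import numpy as np

def log2det(M):
    return np.linalg.slogdet(M)[1] / np.log(2.0)

def rate_terms(H, rho):
    """Return (I_sA, first term, second term) of the factorization of (13)."""
    n_R, n_T = H.shape
    J = np.kron(np.eye(n_T // 2), [[0.0, 1.0], [-1.0, 0.0]])
    He = np.conj(H) @ J
    a = rho / n_T
    M = np.eye(n_T) + a * H.conj().T @ H
    total = 0.5 * log2det(M + a * He.conj().T @ He)
    first = 0.5 * log2det(M)                      # half the capacity
    second = 0.5 * log2det(np.eye(n_R) + a * He @ np.linalg.solve(M, He.conj().T))
    return total, first, second

rng = np.random.default_rng(0)
for n_R, n_T in [(1, 4), (2, 4), (3, 4), (4, 4), (2, 6)]:
    H = (rng.standard_normal((n_R, n_T)) + 1j * rng.standard_normal((n_R, n_T))) / np.sqrt(2)
    total, first, second = rate_terms(H, 10.0)
    print(n_R, n_T, total, first + second)
```

The second term is non-negative, so dropping it gives the lower bound of half the capacity for any antenna configuration with even $n_T$.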

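Another lower bound is obtained for the case $n_T < n_R$
by applying Minkowski's determinant inequality [34, p. 482]
($\det(\mathbf{A} + \mathbf{B}) \ge (\det(\mathbf{A})^{1/n} + \det(\mathbf{B})^{1/n})^n$, $\mathbf{A} \succeq 0$, $\mathbf{B} \succeq 0$)
to (9):

$$R_{sA} = \mathbb{E}\left[\frac{1}{2}\log_2\det\left(\mathbf{I} + \frac{\rho}{n_T}(\mathbf{H}')^H\mathbf{H}'\right)\right]$$

$$\phantom{R_{sA}} \ge \frac{n_T}{2}\,\mathbb{E}\left[\log_2\left(1 + \rho\det\left(\frac{1}{n_T}(\mathbf{H}')^H\mathbf{H}'\right)^{1/n_T}\right)\right]$$

$$\phantom{R_{sA}} = \frac{n_T}{2}\,\mathbb{E}\left[\log_2\left(1 + \rho\det\left(\frac{1}{n_T}\left(\mathbf{H}^H\mathbf{H} + \mathbf{H}_e^H\mathbf{H}_e\right)\right)^{1/n_T}\right)\right].$$

Applying again Minkowski's determinant inequality results in

$$R_{sA} \ge \frac{n_T}{2}\,\mathbb{E}\left[\log_2\left(1 + \rho\det\left(\frac{1}{n_T}\mathbf{H}^H\mathbf{H}\right)^{1/n_T} + \rho\det\left(\frac{1}{n_T}\mathbf{H}_e^H\mathbf{H}_e\right)^{1/n_T}\right)\right].$$

Since $\mathbf{H}_e$ is obtained simply by conjugating and exchanging
some elements of the actual matrix $\mathbf{H}$, it can be shown that the
eigenvalues of $\mathbf{H}_e^H\mathbf{H}_e$ are the same as the eigenvalues of
$\mathbf{H}^H\mathbf{H}$. Therefore, the lower bound is equal to

$$R_{sA} \ge \frac{n_T}{2}\,\mathbb{E}\left[\log_2\left(1 + \rho\exp\ln\left(2\det\left(\frac{1}{n_T}\mathbf{H}^H\mathbf{H}\right)^{1/n_T}\right)\right)\right].$$

¹Notation: $\mathbf{A}^T$, $\mathbf{A}^H$, $\mathbf{A}^*$ denote the transpose, Hermitian transpose, and
complex conjugate, respectively.

Since $\log_2(1 + c\,e^x)$ is a convex function in $x$ for $c > 0$, applying
Jensen's inequality gives $\mathbb{E}[\log_2(1 + c\,e^x)] \ge \log_2(1 + c\exp(\mathbb{E}[x]))$, and we have

$$R_{sA} \ge \frac{n_T}{2}\log_2\left(1 + \rho\exp\mathbb{E}\left[\ln\left(2\det\left(\frac{1}{n_T}\mathbf{H}^H\mathbf{H}\right)^{1/n_T}\right)\right]\right) = \frac{n_T}{2}\log_2\left(1 + 2\rho\exp\left(\frac{1}{n_T}\mathbb{E}\left[\ln\det\left(\frac{1}{n_T}\mathbf{H}^H\mathbf{H}\right)\right]\right)\right).$$

From [35], [36], we know that

$$\mathbb{E}\left[\ln\det\left(\frac{1}{n_T}\mathbf{H}^H\mathbf{H}\right)\right] = \sum_{j=1}^{n_T}\mathbb{E}[\ln X_j] - n_T\ln n_T,$$

where the $X_j$ are independent chi-square distributed random
variables with $2(n_R - j + 1)$ degrees of freedom. Using this
yields

$$R_{sA} \ge \frac{n_T}{2}\log_2\left(1 + \frac{2\rho}{n_T}\exp\left(\frac{1}{n_T}\sum_{j=1}^{n_T}\mathbb{E}[\ln X_j]\right)\right).$$

With

$$\mathbb{E}[\ln X_j] = \psi(n_R - j + 1),$$

where $\psi(\cdot)$ is the digamma function, which may be rewritten
for integer arguments as follows

$$\psi(x) = -\gamma + \sum_{p=1}^{x-1}\frac{1}{p}.$$

Using this results in the following lower bound for the average
rate of the stacked scheme.

$$R_{sA} \ge \frac{n_T}{2}\log_2\left(1 + \frac{2\rho}{n_T}\exp\left(\frac{1}{n_T}\sum_{j=1}^{n_T}\sum_{p=1}^{n_R-j}\frac{1}{p} - \gamma\right)\right) \qquad [\text{case } n_T < n_R].$$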
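
The lower bound for the case $n_T < n_R$ can be evaluated with the integer form of the digamma function and compared against a Monte Carlo estimate of the average rate. The sketch below is illustrative; function names, the sample size, and the chosen antenna numbers are assumptions.

```python
import numpy as np

def psi_int(x):
    """Digamma function at a positive integer x: -gamma + sum_{p=1}^{x-1} 1/p."""
    return -np.euler_gamma + sum(1.0 / p for p in range(1, x))

def lower_bound_case_i(rho, n_T, n_R):
    """(n_T/2) log2(1 + (2 rho/n_T) exp((1/n_T) sum_j psi(n_R - j + 1)))."""
    mean_log = sum(psi_int(n_R - j + 1) for j in range(1, n_T + 1)) / n_T
    return 0.5 * n_T * np.log2(1 + 2 * rho / n_T * np.exp(mean_log))

def stacked_rate(H, rho):
    """I_sA of (9) for one channel realization."""
    n_T = H.shape[1]
    J = np.kron(np.eye(n_T // 2), [[0.0, 1.0], [-1.0, 0.0]])
    He = np.conj(H) @ J
    gram = H.conj().T @ H + He.conj().T @ He
    return 0.5 * np.linalg.slogdet(np.eye(n_T) + rho / n_T * gram)[1] / np.log(2.0)

rng = np.random.default_rng(0)
n_T, n_R, rho = 2, 4, 10.0
rates = []
for _ in range(4000):
    H = (rng.standard_normal((n_R, n_T)) + 1j * rng.standard_normal((n_R, n_T))) / np.sqrt(2)
    rates.append(stacked_rate(H, rho))
print(np.mean(rates), lower_bound_case_i(rho, n_T, n_R))
```

Because $\sum_j \psi(n_R - j + 1)/n_T$ contains the constant $-\gamma$ exactly once, the expression in `lower_bound_case_i` coincides with the double-sum form of the bound.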

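Similar steps can be pursued for $n_T > 4n_R$ resulting in the
following lower bound

$$R_{sA} \ge n_R\log_2\left(1 + \frac{2\rho}{n_T}\exp\left(\frac{1}{2n_R}\sum_{j=1}^{2n_R}\sum_{p=1}^{n_T/2-j}\frac{1}{p} - \gamma\right)\right) \qquad [\text{case } n_T > 4n_R].$$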
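For the case of $n_T > 2n_R$ we rewrite (9) as

$$I_{sA} = \frac{1}{2}\log_2\det\left(\mathbf{I}_{2n_R} + \frac{\rho}{n_T}\begin{bmatrix} \mathbf{H}\mathbf{H}^H & \mathbf{H}\mathbf{H}_e^H \\ \mathbf{H}_e\mathbf{H}^H & \mathbf{H}_e\mathbf{H}_e^H \end{bmatrix}\right). \qquad (14)$$

Since $\mathbb{E}[\mathbf{H}\mathbf{H}_e^H] = \mathbf{0}$ from Proposition 4.1, we may proceed
as in [2] to arrive at a lower bound given as

$$I_{sA} \ge \frac{1}{2}\sum_{k=1}^{L_1}\log_2\left(1 + \frac{\rho}{n_T}X_k\right),$$

where the $X_k$ are again independent chi-square distributed random
variables with $2(K_1 - k + 1)$ degrees of freedom with
$K_1 = \max(2n_R, n_T)$. By following the same line of arguments as
in [35], we arrive at

$$R_{sA} \ge \frac{1}{2}\sum_{k=1}^{L_1}\log_2\left(1 + \frac{\rho}{n_T}\exp\left(\sum_{p=1}^{K_1-k}\frac{1}{p} - \gamma\right)\right) \qquad [\text{case } n_T > 2n_R].$$

In [23], a similar (however, looser) lower bound was derived
for this case in order to analyze the asymptotic performance
(with respect to $\rho$) of stacked OSTBC.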
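For the case of $n_R < n_T < 2n_R$ we have

$$R_{sA} = \mathbb{E}\left[\frac{1}{2}\log_2\det\left(\mathbf{I} + \frac{\rho}{n_T}(\mathbf{H}')^H\mathbf{H}'\right)\right]$$

$$\phantom{R_{sA}} = \mathbb{E}\left[\frac{1}{2}\log_2\det\left(\mathbf{I} + \frac{\rho}{n_T}\left(\mathbf{H}^H\mathbf{H} + \mathbf{H}_e^H\mathbf{H}_e\right)\right)\right]$$

$$\phantom{R_{sA}} = \mathbb{E}\left[\frac{1}{2}\log_2\det\left(\frac{1}{2}\mathbf{I} + \frac{\rho}{n_T}\mathbf{H}^H\mathbf{H} + \frac{1}{2}\mathbf{I} + \frac{\rho}{n_T}\mathbf{H}_e^H\mathbf{H}_e\right)\right].$$

Applying now Minkowski's determinant inequality results in

$$R_{sA} \ge \mathbb{E}\left[\frac{1}{2}\log_2\det\left(\mathbf{I} + \frac{2\rho}{n_T}\mathbf{H}^H\mathbf{H}\right)\right] \qquad (15)$$

and finally

$$R_{sA} \ge \frac{1}{2}\sum_{j=1}^{L}\log_2\left(1 + \frac{2\rho}{n_T}\exp\left(\sum_{p=1}^{K-j}\frac{1}{p} - \gamma\right)\right) \qquad [\text{case } n_R < n_T < 2n_R].$$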

The lower bound results derived in this subsection are
summarized in Table I.

Note that for high SNR, most of the bounds have a slope
equal to $L_1/2$, which equals the slope of the upper bound (11).
Only for the case $n_R < n_T < 2n_R$, the slope of the lower
bound is equal to $L/2$. In Fig. 1,
the average rate, the upper bound (11) and the lower bounds
from Table I for $n_T = 4$ and $n_R = 1, \dots, 4$ are depicted.
From the figure, we observe that the upper bound in (11) and the
lower bounds track the average rate quite well. Only in the
aforementioned case $n_R < n_T < 2n_R$, the slope of the lower
bound differs from the exact performance and the upper bound.
Note that for $n_R = 1$, the upper bound coincides with the exact
performance.



C. Characterization of the absolute rate loss 

In this subsection, we characterize the absolute rate loss
of the stacked OSTBC to the ergodic capacity using Fischer's
inequality. First of all, we discuss the case of $n_T > 2n_R$. Note
that the rate loss with the basic Alamouti scheme ($n_T = 2$) was
also analyzed in [8], [10] using different approaches. Starting
from (14), applying Fischer's inequality and averaging over
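TABLE I
Lower bound on $R_{sA}$ for the different cases

Case $n_T < n_R$: $\frac{n_T}{2}\log_2\left(1 + \frac{2\rho}{n_T}\exp\left(\frac{1}{n_T}\sum_{j=1}^{n_T}\sum_{p=1}^{n_R-j}\frac{1}{p} - \gamma\right)\right)$

Case $n_R < n_T < 2n_R$: $\frac{1}{2}\sum_{j=1}^{L}\log_2\left(1 + \frac{2\rho}{n_T}\exp\left(\sum_{p=1}^{K-j}\frac{1}{p} - \gamma\right)\right)$

Case $2n_R < n_T$: $\frac{1}{2}\sum_{k=1}^{L_1}\log_2\left(1 + \frac{\rho}{n_T}\exp\left(\sum_{p=1}^{K_1-k}\frac{1}{p} - \gamma\right)\right)$

Case $4n_R < n_T$: $n_R\log_2\left(1 + \frac{2\rho}{n_T}\exp\left(\frac{1}{2n_R}\sum_{j=1}^{2n_R}\sum_{p=1}^{n_T/2-j}\frac{1}{p} - \gamma\right)\right)$

Fig. 1. Average rates, upper bounds, and lower bounds of the stacked OSTBCs for $n_T = 4$; panels (a)-(d) correspond to $n_R = 1, \dots, 4$, plotted over SNR [dB].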

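all channel realizations we arrive at

$$R_{sA} \le \mathbb{E}\left[\log_2\det\left(\mathbf{I}_{n_R} + \frac{\rho}{n_T}\mathbf{H}\mathbf{H}^H\right)\right] = C(\rho, n_T, n_R) \qquad [\text{case } n_T > 2n_R], \qquad (16)$$

i.e. as long as $n_T > 2n_R$, the average rate of the stacked
OSTBC is only upper bounded by the ergodic capacity.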



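Proceeding similarly for the case $n_T < 2n_R$ results in

$$R_{sA} = \mathbb{E}\left[\frac{1}{2}\log_2\det\left(\mathbf{I}_{n_T} + \frac{\rho}{n_T}\begin{bmatrix} \mathbf{H}^H\mathbf{H} & \mathbf{H}^H\mathbf{H}_e \\ \mathbf{H}_e^H\mathbf{H} & \mathbf{H}_e^H\mathbf{H}_e \end{bmatrix}\right)\right]$$

$$\phantom{R_{sA}} \le \mathbb{E}\left[\log_2\det\left(\mathbf{I}_{n_T/2} + \frac{\rho}{n_T}\mathbf{H}^H\mathbf{H}\right)\right] = C\left(\frac{\rho}{2}, \frac{n_T}{2}, 2n_R\right) \le C(\rho, n_T, n_R) \qquad [\text{case } n_T < 2n_R], \qquad (17)$$

where $\mathbf{H}$ is obtained by taking the odd columns of the
equivalent channel $\mathbf{H}'$ and $\mathbf{H}_e$ is obtained by taking the even
columns of $\mathbf{H}'$. From (17), we observe that for $n_T < 2n_R$
the average rate of the stacked OSTBC is upper bounded by
the ergodic capacity of a system with $n_T/2$ transmit and $2n_R$
receive antennas with a power penalty of 3 dB.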

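We can characterize the gap in (16) and (17) due to the
application of Fischer's inequality. For $n_T < 2n_R$, we have
then

$$\Delta = C\left(\frac{\rho}{2}, \frac{n_T}{2}, 2n_R\right) - R_{sA} = \frac{1}{2}\,\mathbb{E}\left[\log_2\frac{\det\left(\mathbf{I}_{n_T} + \frac{\rho}{n_T}\mathbf{W}_D\right)}{\det\left(\mathbf{I}_{n_T} + \frac{\rho}{n_T}\begin{bmatrix} \mathbf{H}^H\mathbf{H} & \mathbf{H}^H\mathbf{H}_e \\ \mathbf{H}_e^H\mathbf{H} & \mathbf{H}_e^H\mathbf{H}_e \end{bmatrix}\right)}\right],$$

where

$$\mathbf{W}_D = \begin{bmatrix} \mathbf{H}^H\mathbf{H} & \mathbf{0} \\ \mathbf{0} & \mathbf{H}_e^H\mathbf{H}_e \end{bmatrix}.$$

With

$$\mathbf{W}_{\mathrm{off}} = \begin{bmatrix} \mathbf{0} & \mathbf{H}^H\mathbf{H}_e \\ \mathbf{H}_e^H\mathbf{H} & \mathbf{0} \end{bmatrix}$$

we can rewrite

$$\det\left(\mathbf{I}_{n_T} + \frac{\rho}{n_T}\begin{bmatrix} \mathbf{H}^H\mathbf{H} & \mathbf{H}^H\mathbf{H}_e \\ \mathbf{H}_e^H\mathbf{H} & \mathbf{H}_e^H\mathbf{H}_e \end{bmatrix}\right) = \det\left(\mathbf{I}_{n_T} + \frac{\rho}{n_T}(\mathbf{W}_D + \mathbf{W}_{\mathrm{off}})\right)$$

$$= \det\left(\mathbf{I}_{n_T} + \frac{\rho}{n_T}\mathbf{W}_D\right) \times \det\left(\mathbf{I}_{n_T} + \left(\mathbf{I}_{n_T} + \frac{\rho}{n_T}\mathbf{W}_D\right)^{-1}\frac{\rho}{n_T}\mathbf{W}_{\mathrm{off}}\right)$$

to arrive at

$$\Delta = -\frac{1}{2}\,\mathbb{E}\left[\log_2\det\left(\mathbf{I}_{n_T} + \mathbf{A}\right)\right], \qquad \mathbf{A} = \left(\mathbf{I}_{n_T} + \frac{\rho}{n_T}\mathbf{W}_D\right)^{-1}\frac{\rho}{n_T}\mathbf{W}_{\mathrm{off}}.$$

Using

$$\det(\mathbf{I}_{n_T} + \mathbf{A}) = \exp\left(\sum_{k=1}^{n_T}\ln(1 + \mu_k)\right),$$

where the $\mu_k$ denote the eigenvalues of $\mathbf{A}$, yields

$$\Delta \le \frac{1}{2\ln(2)}\,\mathbb{E}\left[\sum_{k=1}^{n_T}\mu_k^2\right],$$

where the inequality follows from the Taylor series expansion
$x - x^2 \le \ln(1 + x)$ around $x = 0$ and the
fact that $\mathrm{tr}(\mathbf{A}) = 0$, since $\mathbf{A}$ has zero block matrices
on its diagonal. Its off-diagonal blocks have the form

$$\mathbf{B} = \left(\mathbf{I}_{n_T/2} + \frac{\rho}{n_T}\mathbf{H}^H\mathbf{H}\right)^{-1}\frac{\rho}{n_T}\mathbf{H}^H\mathbf{H}_e \quad\text{and}\quad \mathbf{B}_e = \left(\mathbf{I}_{n_T/2} + \frac{\rho}{n_T}\mathbf{H}_e^H\mathbf{H}_e\right)^{-1}\frac{\rho}{n_T}\mathbf{H}_e^H\mathbf{H},$$

respectively. Note that the
matrices in brackets have the same eigenvalues. This implies
that each eigenvalue of $\mathbf{A}$ appears twice, i.e. $\mu_k = \mu_{k + L_1/2}$,
$1 \le k \le L_1/2$. Additionally applying the inequality [34,
(5.6.EX.26)] $\mathrm{tr}(\mathbf{A}^2) \le \|\mathbf{A}\|_F^2$ we obtain
$\sum_{k=1}^{L_1}\mu_k^2 = 2\sum_{k=1}^{L_1/2}\mu_k^2 \le 2\|\mathbf{B}\|_F^2$. Further we have

$$\|\mathbf{B}\|_F^2 = \mathrm{tr}\left(\frac{\rho}{n_T}\mathbf{H}_e\mathbf{H}_e^H \cdot \frac{\rho}{n_T}\mathbf{H}\left(\mathbf{I}_{n_T/2} + \frac{\rho}{n_T}\mathbf{H}^H\mathbf{H}\right)^{-2}\mathbf{H}^H\right),$$

which can be interpreted as the trace of a product of two
positive semidefinite matrices $\mathbf{P}$, $\mathbf{Q}$. Using the fact that
$\mathbf{H}_e\mathbf{H}_e^H$ has the same ordered eigenvalues as $\mathbf{H}\mathbf{H}^H$ and the
inequality $\mathrm{tr}(\mathbf{P}\mathbf{Q}) \le \sum_k \lambda_k(\mathbf{P})\lambda_k(\mathbf{Q})$ [37] yields
$\sum_{k=1}^{L_1/2}\mu_k^2 \le L_1/2$ and we arrive at the final bound

$$\Delta \le \frac{1}{\ln(2)}\,\mathbb{E}\left[\sum_{k=1}^{L_1/2}\mu_k^2\right] \le \frac{L_1}{2\ln(2)}.$$

Since the events of $\mathbf{H}_e^H\mathbf{H}$ not having full rank are of measure
zero, the strict form of Fischer's inequality stated in Lemma
4.1 shows that the gap in (16) and (17) is nonzero in general,
i.e. $\Delta > 0$, thus it is not possible to reach the upper capacity
bounds.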

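In addition to that, we have the loss between
$C\left(\frac{\rho}{2}, \frac{n_T}{2}, 2n_R\right)$ and $C(\rho, n_T, n_R)$. Approximating (7)
and (12) for high SNR as

$$C_{Jen}(\rho, n_T, n_R) = \log_2\left(1 + \sum_{i=1}^{L}\binom{L}{i}\frac{K!}{(K-i)!}\left(\frac{\rho}{n_T}\right)^{i}\right) \approx L\log_2\left(1 + \frac{\rho}{n_T}\right) \qquad (18)$$

and

$$C\left(\frac{\rho}{2}, \frac{n_T}{2}, 2n_R\right) \overset{(a)}{\ge} \sum_{j=1}^{n_T/2}\log_2\left(1 + \frac{\rho}{n_T}\exp\left(\sum_{p=1}^{2n_R-j}\frac{1}{p} - \gamma\right)\right) \approx \frac{L_1}{2}\log_2\left(1 + \frac{\rho}{n_T}\right), \qquad (19)$$

where (a) follows from applying Jensen's inequality to (12).
With (18) and (19), the loss between $C\left(\frac{\rho}{2}, \frac{n_T}{2}, 2n_R\right)$ and
$C(\rho, n_T, n_R)$ is quite accurately described by

$$C(\rho, n_T, n_R) - C\left(\frac{\rho}{2}, \frac{n_T}{2}, 2n_R\right) \approx \left(L - \frac{L_1}{2}\right)\log_2\left(1 + \frac{\rho}{n_T}\right) \qquad [\text{case } n_T < 2n_R].$$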
Finally, the absolute loss for $n_T < 2n_R$ between the ergodic
capacity of a MIMO system and the stacked scheme is given
by

$$\left(L - \frac{L_1}{2}\right)\log_2\left(1 + \frac{\rho}{n_T}\right) \le C(\rho, n_T, n_R) - R_{sA} \le \frac{L_1}{2\ln(2)} + \left(L - \frac{L_1}{2}\right)\log_2\left(1 + \frac{\rho}{n_T}\right).$$

The same procedure can be pursued for $n_T > 2n_R$, resulting
in the following general characterization for any $n_T$, $n_R$

$$\max\left(0, L - \frac{L_1}{2}\right)\log_2\left(1 + \frac{\rho}{n_T}\right) \le C(\rho, n_T, n_R) - R_{sA} \le \frac{L_1}{2\ln(2)} + \max\left(0, L - \frac{L_1}{2}\right)\log_2\left(1 + \frac{\rho}{n_T}\right)$$

which is equal to 



sA 



< 



P 



(20) 



From (20), we observe that as long as $n_T > 2n_R$, the absolute
loss is only a constant, which depends only on the number of
receive antennas. In case $n_T < 2n_R$ the absolute loss increases
linearly with $(L - L_1/2)$.

V. SUBOPTIMAL DETECTION AND CONDITION NUMBER 

In the previous sections, we have shown that the stacked 
OSTBC achieves significant portions of the ergodic capac- 
ity. This does not, however, guarantee good performance in 
terms of error probability, which will be investigated in this 
section. Note that in the analysis in the previous sections it 
was implicitly assumed, that an optimal maximum-likelihood 
detector is used at the receiver, which performs an exhaustive 
search over all possible transmit symbols at each detection 
step. Especially for higher number of transmit antennas, this 
becomes computationally prohibitive. If additionally high rates 
are requested, then higher order modulation sizes are necessary 
which increases the computational complexity even more. 
Thus, suboptimal detection schemes have to be employed 
reducing the detection complexity and thereby achieving 
reasonable error rate performance results. Therefore, in this
section the impact of the suboptimal LR-aided linear ZF-detector
on the performance of the stacked OSTBC is analyzed
and compared to SM and QSTBC by resorting to the equivalent
channel representation. In order to apply the LR algorithm,
the system model has to be rewritten, which is done in the
following subsections for the different transmission schemes.
Afterwards, the LR-aided linear ZF-detection is described
briefly.



A. Spatial Multiplexing (SM)

For SM, the transmit matrix $\mathbf{G}_{n_T}$ is reduced to $\mathbf{x}$, since
$T = 1$. In order to apply the suboptimal LR for SM, the
system model in (1) has to be rewritten as a real model [28]
of the form


YE = 



3?{x} 
3{x} 



1 T 



rrSM I „ 



where 



and 



Ye 





T 


■ 5R{n} " 






3{n} 



HSM 
E 



3?{H} 3{H} 
-9{H} 5R{H} 



In the following, we refer to $\mathbf{H}_E^{SM}$ as the equivalent channel
for the SM scheme.
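
A real-valued equivalent of a complex linear model can be sketched as follows. The arrangement below uses the common column-vector convention $\mathbf{y} = \mathbf{H}\mathbf{x}$ with stacked real and imaginary parts; this convention and the function names are assumptions made for illustration, and the paper's exact arrangement of the blocks may differ (for example by a transposition).

```python
import numpy as np

def real_model(H):
    """Real 2m x 2n matrix acting on [Re(x); Im(x)] for a complex m x n matrix H."""
    return np.block([[H.real, -H.imag],
                     [H.imag,  H.real]])

def stack_real(v):
    """Stack real and imaginary parts of a complex vector."""
    return np.concatenate([v.real, v.imag])

rng = np.random.default_rng(0)
n_R, n_T = 4, 4
H = (rng.standard_normal((n_R, n_T)) + 1j * rng.standard_normal((n_R, n_T))) / np.sqrt(2)
x = (2 * rng.integers(0, 2, n_T) - 1) + 1j * (2 * rng.integers(0, 2, n_T) - 1)

lhs = real_model(H) @ stack_real(x)
rhs = stack_real(H @ x)
print(np.allclose(lhs, rhs))
```

Working with the real model lets a real-valued lattice reduction algorithm operate on an integer lattice of twice the dimension.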



B. QSTBC 

Without loss of generality, in this subsection we briefly
describe the QSTBC for $n_T = 4$ transmit antennas [38].
The generalization to a higher number of transmit antennas is
straightforward [16]. The transmit matrix for $n_T = 4$ transmit
antennas is then given by [16], [38]



Xl 


X2 


a;3 


X4 


X2 




xl 






— X4 


-Xl 






X3 


—X2 


-xl 



G4(X) 



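After rewriting (1), we arrive at (similar to the proposed
scheme, cf. (2))

$$\mathbf{y}'_Q = \mathbf{H}^Q\mathbf{x} + \mathbf{n}'_Q, \qquad (21)$$

with $\mathbf{H}^Q = \left[(\mathbf{H}_1^Q)^T, \dots, (\mathbf{H}_i^Q)^T, \dots, (\mathbf{H}_{n_R}^Q)^T\right]^T$ and $\mathbf{H}_i^Q$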



hii 


h2i 


hzi 


hii 


h* 


h* 
"■li 


h* 


h* 


hsi 


hii 


hii 


-h2i 


h* 


— h* 
'hi 


h* 
'hi 


h* 



where H'^ = 
is given as 



For general $n_T$, we have to rewrite the system model in (21)
as a real model similar to SM. For $n_T = 4$, however, it
is not necessary to resort to the real system model. Here,
the system model can be decomposed such that the iterative
optimal algorithm in [26] for a system with $n_T = 2$ transmit
antennas can be applied. For this we first perform channel-matched
filtering as the first stage and noise pre-whitening as
the second stage of preprocessing at the receiver, resulting in
two independent subsystems [39], one of which



Yo 







Xl 






^3 



+ no , 



is only a function of the elements of x with odd index, and 
the other one is only a function of the elements of x with even 
index, 

+ He , 



'13 Jf3' 




X4 






X2 



T = 1. In order to apply the suboptimal LR for SM, the 
where is the 2x2 equivalent channel for QSTBC,

/3 



— / A+a 



with $\lambda = \sum_{i=1}^{n_R}\sum_{j=1}^{4}|h_{i,j}|^2$ and
$\alpha = \sum_{i=1}^{n_R} 2\,\Im\left(h_{i,1}^* h_{i,3} + h_{i,4}^* h_{i,2}\right)$. Both subsystems can now
be detected separately, which reduces the complexity of the
receiver significantly.

Lemma 5.1: In order to get the best performance with
respect to error rates and a decoupled system with scalar input
and scalar output as in the case of OSTBC, the columns of
the equivalent QSTBC channel have to be orthogonal. However,
the probability that this occurs is zero.

Proof: For orthogonality, it follows from the scalar
product of the columns of the equivalent channel that $\alpha$ has to be zero. But
since the channel entries $\{h_{i,j}\}$ are mutually independent
and identically distributed (i.i.d.) random complex Gaussian
processes, the probability $\Pr(\alpha = 0)$ is equal to the probability
$\Pr\left(\sum_{i=1}^{n_R} 2\,\Im\left(h_{i,1}^* h_{i,3} + h_{i,4}^* h_{i,2}\right) = 0\right)$, which in turn is
zero. From this it follows that orthogonality and therefore a
decoupled system cannot be achieved. ■
A disadvantage of this QSTBC is that in order to achieve
the same transmission rate as SM, we have to compensate
the rate loss by using a considerably higher constellation.
But recall that higher constellations complicate amplification,
synchronization, and detection. E.g., a transmission rate of 4
bits/sec/Hz for a system with $n_T = 4$ transmit antennas is
achieved by SM with BPSK, whereas 16QAM is required
for the code rate one QSTBC. In [14], [40] it was shown
that QSTBC approach the capacity in case of $n_R = 1$,
which is achieved in case of the stacked OSTBC as shown
in Section IV-A. For $n_R > 1$, the performance of QSTBC in
terms of mutual information degrades severely in contrast to
the stacked OSTBC, which achieves at least half of the capacity
as derived in Section IV-B.

C. Proposed scheme 

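Given (2), the equivalent real signal model for the proposed
stacked OSTBC is given as

$$\mathbf{y}_E = \mathbf{H}_E^{OS}\begin{bmatrix} \Re\{\mathbf{x}\} \\ \Im\{\mathbf{x}\} \end{bmatrix} + \mathbf{n}_E,$$

where

$$\mathbf{H}_E^{OS} = \begin{bmatrix} \Re\{\mathbf{H}'\} & -\Im\{\mathbf{H}'\} \\ \Im\{\mathbf{H}'\} & \Re\{\mathbf{H}'\} \end{bmatrix}.$$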



D. LR-aided linear ZF Detection 



By applying the algorithm, the $m \times n$ equivalent channel
$H_e$ for each transmission scheme can be decomposed as

$$ H_e = QR, \tag{22} $$

where $R$ is an $n \times n$ matrix with integer entries and $Q$ is an $m \times n$
matrix, which is better conditioned than $H_e$, i.e., the columns
of $Q$ are less correlated and shorter. A good indicator of
the correlation of a matrix is the so-called condition number,
which is defined as the ratio of the largest singular value of
the matrix to the smallest. Using (22), the equivalent signal
model is then given as

$$ y = QRx + n = Qz + n, \quad z = Rx. $$



Now, by multiplying $y$ from the left by $Q^{-1}$ we arrive at

$$ \tilde{y} = z + Q^{-1} n, $$

where the noise enhancement and coloring is relatively small,
since $Q^{-1}$ is also well conditioned. In order to get an estimate
of the transmitted symbols, the following operation has
to be applied:

$$ \hat{x} = C \left( R^{-1} Q_{\mathbb{Z}^n}\!\left[ \tfrac{1}{C}\tilde{y} - \tfrac{1}{2} R 1_n \right] + \tfrac{1}{2} 1_n \right), \tag{23} $$

where $1_n$ is an $n \times 1$ vector of ones, $C$ is a constant
scaling factor, and $Q_{\mathbb{Z}^n}[\cdot]$ describes the component-wise
quantization with respect to the infinite integer space
$\mathbb{Z}$. However, this quantization can only be applied if the
transmit modulation signal set $\mathcal{C}$ is transformed to $\mathbb{Z}$, which
is achieved with the scaling and shifting of $\tilde{y}$ within the
quantization operation in (23). Note that after this quantization,
re-scaling and re-shifting, some points may lie outside the
constellation. A suboptimal solution is to assign these points
to the nearest point within the constellation. For BPSK, this
assignment has a significant effect on the error
rate performance; however, this gain diminishes with higher
order modulations.
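The detection steps described above (filtering, scaling and shifting, integer quantization, undoing $R$, and mapping stray points back into the constellation) can be sketched as follows. This is a toy illustration under several assumptions: the lattice-reduction step itself is not implemented, so $Q$ and a unimodular integer $R$ are simply constructed by hand; the symbols are assumed to satisfy $x/C \in \mathbb{Z} + 1/2$; and the sign convention of the shift is one consistent choice, not necessarily the one used in the paper. The name `lr_zf_detect` is hypothetical.

```python
import numpy as np


def lr_zf_detect(y, q, r, c, levels):
    """LR-aided ZF detection for a given decomposition H_e = Q R.

    q, r   : reduced basis and integer (unimodular) matrix, assumed given
    c      : scaling constant such that x / c lies on the grid Z + 1/2
    levels : 1-D array of valid constellation points per real dimension
    """
    ones = np.ones(r.shape[0])
    y_tilde = np.linalg.pinv(q) @ y                   # filtered receive vector
    z_int = np.rint(y_tilde / c - 0.5 * (r @ ones))   # scale, shift, quantize
    x_hat = c * (np.linalg.solve(r, z_int) + 0.5 * ones)  # undo R, re-shift, re-scale
    # Points that fall outside the constellation go to the nearest valid point.
    idx = np.abs(x_hat[:, None] - levels[None, :]).argmin(axis=1)
    return levels[idx]


rng = np.random.default_rng(1)
n = 4
q, _ = np.linalg.qr(rng.standard_normal((n, n)))  # toy well-conditioned Q
r = np.array([[1, 2, 0, 1],
              [0, 1, 3, 0],
              [0, 0, 1, 2],
              [0, 0, 0, 1]], dtype=float)          # integer entries, det = 1
h_e = q @ r                                        # poorly conditioned channel

c = 2.0
levels = np.array([-1.0, 1.0])                     # BPSK per real dimension
x = rng.choice(levels, size=n)
y = h_e @ x + 0.05 * rng.standard_normal(n)

x_hat = lr_zf_detect(y, q, r, c, levels)
print(np.array_equal(x_hat, x))
```

Because the quantization acts on $z = Rx$ rather than on $x$ directly, the noise is only filtered by the well-conditioned $Q$, which is the source of the gain over plain ZF.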

E. Condition number 

For illustration, the probability density functions (pdfs) of
the natural logarithm of the condition number of the channels
for the stacked OSTBC and SM are depicted in Fig. 2. From
the figure, we observe that the SM channel is badly conditioned
and that LR has a great impact on the channel. For the stacked
OSTBC, we observe that the impact of LR is not as significant
as for SM.

Fig. 2. Pdfs of channel condition numbers with SM or the stacked OSTBC with
and without LR for a 4 × 4 system.

The pdf of the natural logarithm of the condition number
for the QSTBC is depicted in Fig. 3. For comparison, the pdf
for the stacked OSTBC is also plotted. In case of QSTBC,
for some channels we have no gain with LR, since many
samples of the equivalent channel generated with QSTBC have
inherently low condition numbers such that the LR has no
effect. Different from the QSTBC, for the stacked OSTBC
there is a gain achieved by applying the LR for almost
all samples of the equivalent channel model. Note that for
orthogonal channels (e.g., with OSTBC), the pdf is a Dirac
impulse at position 0.

Fig. 3. Pdfs of channel condition numbers with the stacked OSTBC or QSTBC
with and without LR.
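An empirical pdf of the log condition number, as discussed above, can be obtained by Monte Carlo sampling. The sketch below does this for a plain 4 × 4 i.i.d. complex Gaussian channel without LR (an assumed stand-in for the SM case, not the paper's exact simulation setup) and also confirms that a matrix with orthonormal columns has a log condition number of zero.

```python
import numpy as np


def log_condition_number(h):
    """Natural log of the ratio of the largest to the smallest singular value."""
    s = np.linalg.svd(h, compute_uv=False)  # singular values, descending
    return np.log(s[0] / s[-1])


rng = np.random.default_rng(0)
samples = np.array([
    log_condition_number(
        (rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4)))
        / np.sqrt(2)
    )
    for _ in range(2000)
])
# Normalized histogram as an estimate of the pdf.
hist, edges = np.histogram(samples, bins=30, density=True)

# Orthonormal columns: all singular values are one, so the log is zero.
q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
print(samples.mean() > 0.0, abs(log_condition_number(q)) < 1e-9)
```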



VI. Simulations 

In Fig. 4, the average rate of the stacked Alamouti scheme
and the ergodic capacity of a MIMO system with $n_R = 2$ and
$n_T = 2, 4$, and $8$ are depicted. In case of $n_T = 2$, we
have the standard Alamouti scheme. From the figure, we observe
that the difference between the average rate of the stacked
Alamouti scheme and the capacity diminishes significantly as
the number of transmit antennas increases.



Fig. 4. Ergodic capacity and average rates of the stacked OSTBC with
$n_R = 2$ receive and $n_T = 2$, $n_T = 4$, and $n_T = 8$ transmit antennas.
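The capacity curves in such a plot can be approximated by Monte Carlo averaging. The sketch below assumes the standard expression $C = E[\log_2 \det(I + \tfrac{\rho}{n_T} H H^H)]$ for an i.i.d. Rayleigh channel with equal power allocation; whether the paper uses exactly this normalization is an assumption here, and the rate of the stacked scheme is not reproduced because its formula lies outside this section.

```python
import numpy as np


def ergodic_capacity(n_t, n_r, snr_db, trials=2000, seed=0):
    """Monte Carlo estimate of E[log2 det(I + snr/n_t * H H^H)] in bits."""
    rng = np.random.default_rng(seed)
    snr = 10.0 ** (snr_db / 10.0)
    total = 0.0
    for _ in range(trials):
        h = (rng.standard_normal((n_r, n_t))
             + 1j * rng.standard_normal((n_r, n_t))) / np.sqrt(2)
        gram = np.eye(n_r) + (snr / n_t) * (h @ h.conj().T)
        _, logdet = np.linalg.slogdet(gram)  # natural log of the determinant
        total += logdet / np.log(2.0)
    return total / trials


for n_t in (2, 4, 8):
    print(n_t, round(ergodic_capacity(n_t, 2, 10.0), 2))
```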

In Fig. 5, the average rate of the stacked Alamouti scheme
and the ergodic capacity with $n_T = 4$ and $n_R = 2, 4$, and
$8$ are depicted. In contrast to the case of an increasing number of
transmit antennas, here we observe that the difference between
the average rate of the stacked Alamouti scheme and the
ergodic capacity increases with the number of receive
antennas.

Fig. 5. Ergodic capacity and average rates of the stacked OSTBC with
$n_T = 4$ transmit and $n_R = 2$, $n_R = 4$, and $n_R = 8$ receive antennas.

In Fig. 6, the ratio $C/R_{SA}$ is depicted for $n_T = 8$ transmit
and $n_R = 2$ (bottom) to $n_R = 9$ (top) receive antennas. For
high SNR, we observe that as long as $n_T > 2n_R$ the ratio
decreases as the SNR increases. In case $n_T < 2n_R$ the ratio
increases steadily. As derived in Section IV-B, the ratio is upper
bounded by $C/R_{SA} \le 2$ for any $n_R$, $n_T$.




Fig. 6. Ratio $C/R_{SA}$ for $n_T = 8$ transmit and $n_R = 2$ (bottom) to $n_R = 9$
(top) receive antennas.

In Fig. 7, the ratio $C/R_{SA}$ is depicted for $n_T = 8$ transmit
and $n_R = 4$, $n_R = 6$, and $n_R = 9$ receive antennas. In addition,
we used our lower and upper bounds derived in the
previous section in order to derive lower and upper bounds for
the ratio $C/R_{SA}$, i.e.,



$$ \frac{C_{lb}}{R_{SA}^{ub}} \le \frac{C}{R_{SA}} \le \frac{C_{ub}}{R_{SA}^{lb}}. \tag{24} $$
Based on the derivations in Section IV-B, we know that the
ratio is upper bounded by 2. Further, since the trivial lower
bound is equal to 1, we only depicted $1 \le C/R_{SA} \le 2$. For
$n_R = 9$, we observe that both the lower and upper bound are
getting tighter for higher SNR. At low SNR, the upper bound
performs better than the lower bound. For $n_R = 4$, $n_R = 6$
and low SNR, we observe that the upper bound is quite loose
in comparison to $n_R = 9$. The lower bound for $n_R = 4$ is not
depicted here, since it is lower than the trivial lower bound of
1.



Fig. 7. Ratio $C/R_{SA}$ for $n_T = 8$ transmit and $n_R = 4$, $n_R = 6$, and
$n_R = 9$ receive antennas.



In Fig. 8, the absolute loss $\Delta$ is depicted for $n_T = 6$
transmit antennas and $n_R = 2$ to $4$ and $n_R = 7$ receive
antennas. From the figure, we observe that as long as $n_T >
2n_R$, the slope of the absolute loss tends to a constant for high
SNR. This behavior is tracked quite well by the bound in (20),
which is also depicted in the figure.

Fig. 8. Absolute loss $\Delta$ for $n_T = 6$ transmit and different numbers of
receive antennas.

In Fig. 9, the BER of the stacked OSTBC with QAM and the
QSTBC with 16-QAM is depicted for a transmission rate of
4 bits/sec/Hz. Note that in order to make a fair comparison of
the three transmission schemes (i.e., QSTBC, SM, and stacked
OSTBC), we analyzed a system with $n_T = n_R = 4$ antennas,
since for SM with suboptimal detectors it is necessary that
$n_R \ge n_T$. From the figure, we observe that the performance
of the stacked OSTBC with LR-ZF detection is comparable
with the optimal ML detection. In fact, the diversity gain of
both detectors is equal and there is only a power penalty of
about 1.7 dB of LR-ZF to ML. The gap between ML and LR-ZF
detection is even smaller for QSTBC. Here, the power
penalty is about 0.6 dB. Interestingly, the performance of the
stacked OSTBC for both ML and LR-ZF detection is better
than that of QSTBC in the SNR region shown in the figure.
However, for very high SNR and low BER, the diversity gain
of $n_T n_R$ (contrary to diversity of $2n_R$ for the stacked OSTBC)
for the QSTBC will show its effect and it can be expected that
the performance of QSTBC gets better than that of the stacked
OSTBC. For smaller $n_R$, this intersection point is expected to be
at lower SNR values.

Fig. 9. BER for QSTBC and the stacked OSTBC with ML and LR-ZF, 4
bit/sec/Hz.
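The diversity orders mentioned above show up as the high-SNR slope of a BER curve on a log-log scale. The sketch below estimates that slope with a least-squares line fit. The BER curves are synthetic, of the form $(g \cdot \mathrm{SNR})^{-d}$ with made-up gains $g$; they are not the paper's simulation results, and `estimate_diversity_order` is an illustrative name.

```python
import numpy as np


def estimate_diversity_order(snr_db, ber):
    """Negative slope of log10(BER) versus log10(SNR) from a line fit."""
    log_snr = np.asarray(snr_db, dtype=float) / 10.0  # log10 of linear SNR
    slope, _ = np.polyfit(log_snr, np.log10(ber), 1)
    return -slope


n_t, n_r = 4, 4
snr_db = np.array([20.0, 25.0, 30.0])
snr = 10.0 ** (snr_db / 10.0)

# Synthetic high-SNR curves; the gains 0.05 and 0.2 are arbitrary.
ber_qstbc = (0.05 * snr) ** (-(n_t * n_r))   # diversity n_T * n_R
ber_stacked = (0.2 * snr) ** (-(2 * n_r))    # diversity 2 * n_R

d_qstbc = estimate_diversity_order(snr_db, ber_qstbc)
d_stacked = estimate_diversity_order(snr_db, ber_stacked)
print(d_qstbc, d_stacked)
```

With a steeper slope, the QSTBC curve eventually falls below the stacked OSTBC curve even if it starts higher, which is the intersection behavior described in the text.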

The bit error-rate performance of SM for BPSK and a
transmission rate of 4 bits/sec/Hz is shown in Fig. 10. For
comparison purposes, we also plotted the BER of the stacked
OSTBC with QAM. Here, we observe that the BER performance
with ML detection of the stacked OSTBC is better than
that of SM for all SNR values. In case of LR-ZF detection, SM
performs better than the stacked OSTBC only for low SNR of about 2 dB.
However, the gap in power efficiency between ML and LR-ZF
is higher for the stacked OSTBC in comparison to SM with
BPSK. Note that (as aforementioned) the small gap for SM
is only due to the BPSK modulation. For higher modulation
sizes, this gap is even higher. By increasing the transmission
rate to 8 bits/sec/Hz, i.e., QAM for SM and 16QAM for the
stacked OSTBC, we observe in Fig. 11 that the gap between
ML and LR-ZF is dramatically increased in case of SM to
about 6 dB. On the other hand, the gap between ML and LR-ZF
for the stacked OSTBC and 16QAM is reduced in comparison
to the gap achieved with QAM (cf. Fig. 10) to about 1.3 dB.
Although the performance of SM with ML detection is better
than that of the stacked OSTBC for low and moderate SNR
values, for high SNR values it is the other way around. The
performance of the stacked OSTBC with LR-ZF detection
is better for the whole SNR range in comparison to SM,
which is of higher interest for practical applications, since the
computational complexity of the ML detector is exponential
in the transmission rate. Another disadvantage of SM is that
we need at least as many receive as transmit antennas, i.e.,
$n_T \le n_R$, whereas only $n_T/2$ receive antennas are necessary
for the stacked OSTBC. Multiple receive antennas are only
optional for the QSTBC.

Fig. 10. BER for SM and stacked OSTBC with ML and LR-ZF, 4 bit/sec/Hz.

Fig. 11. BER for SM and stacked OSTBC with ML and LR-ZF, 8 bit/sec/Hz.
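The exponential complexity of ML detection noted above is easy to see in a brute-force implementation, which evaluates every candidate vector. The following is a minimal sketch for a real-valued toy channel with BPSK per dimension; the channel construction (fixed singular values between 0.5 and 2) and the name `ml_detect` are assumptions made for illustration only.

```python
import itertools

import numpy as np


def ml_detect(y, h, levels):
    """Exhaustive ML search: minimize ||y - H x||^2 over all candidates."""
    n = h.shape[1]
    best, best_metric = None, np.inf
    # len(levels) ** n candidates, i.e. exponential in the number of bits.
    for cand in itertools.product(levels, repeat=n):
        x = np.array(cand)
        metric = np.sum(np.abs(y - h @ x) ** 2)
        if metric < best_metric:
            best, best_metric = x, metric
    return best


rng = np.random.default_rng(2)
u, _ = np.linalg.qr(rng.standard_normal((4, 4)))
v, _ = np.linalg.qr(rng.standard_normal((4, 4)))
h = u @ np.diag([2.0, 1.5, 1.0, 0.5]) @ v.T  # toy channel, known singular values

levels = (-1.0, 1.0)  # BPSK: 2**4 = 16 candidates for 4 bits per channel use
x = rng.choice(levels, size=4)
y = h @ x + 0.05 * rng.standard_normal(4)

x_ml = ml_detect(y, h, levels)
print(np.array_equal(x_ml, x), len(levels) ** 4)
```

Doubling the rate to 8 bits per channel use raises the candidate count from 16 to 256, which is why the suboptimal LR-ZF detector is the more practical choice at higher rates.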

VII. Conclusion 

In this paper, we analyzed the performance of stacked
OSTBC in terms of the average rate. We showed that the
stacked scheme achieves the capacity of a MIMO system in
the case of $n_R = 1$ receive antenna. Further, we showed that
the MIMO capacity is at most twice the rate achieved with the
proposed scheme at any SNR. We derived lower and upper
bounds for the rate achieved with this scheme and compared
it with upper and lower bounds for the capacity.

In addition to the capacity analysis, we also analyzed the 
error rate performance of the proposed scheme. To this end, we 
combined the stacked OSTBC with a zero-forcing (ZF) detec- 
tor applying lattice-reduction (LR) aided detection, since this 
suboptimal detector achieves the same diversity as the optimal 
ML detector with only some penalty in power efficiency. We 
analyzed the effect of LR on the equivalent channel generated 
by the stacked OSTBC, for spatial multiplexing (SM) and 
QSTBC. We observed the highest gain for SM and a higher 
gain for the stacked OSTBC in comparison to the QSTBC. 

Finally, we illustrated the theoretical results by numerical 
simulations. From simulation results we observed that the 
stacked scheme approaches the ergodic capacity of a MIMO 
system by increasing the number of transmit antennas for a 
fixed number of receive antennas. Furthermore, we observed 
that as long as the number of transmit antennas is more than twice
the number of receive antennas, the ratio of the capacity to
the rate of the proposed scheme improves as the SNR
increases. Regarding the simulation of the error rate performance,
we observed that in the considered SNR region the stacked 
OSTBC performs better in terms of BER for ML as well as 
for LR-aided ZF-detection than SM and QSTBC in the setup 
given. Further, we observed that the gap between maximum- 
likelihood and LR-ZF detection is dramatically reduced in 
comparison to SM schemes, especially for higher transmission 
rates. 